Adam Guenoun, Indira Martinez, Nicholas Solis
In this analysis, we investigate factors correlated with diabetes. Using a data set of 100,000 people, we examine relationships among age, HbA1c level, smoking history, and blood glucose level. With this many data points, we ask whether the trends in the data match our general understanding of diabetes. Our goal is to assess which of the 9 variables play a stronger role in the development of diabetes and whether the observed trends support our assumptions about the data. Through data visualization, chart analysis, and numerical analysis, we present the data to convince a general audience of the important factors that contribute to diabetic trends.
| gender | age | hypertension | heart_disease | smoking_history | bmi | HbA1c_level | blood_glucose_level | diabetes |
|---|---|---|---|---|---|---|---|---|
| Female | 80 | 0 | 1 | never | 25.19 | 6.6 | 140 | 0 |
| Female | 54 | 0 | 0 | No Info | 27.32 | 6.6 | 80 | 0 |
| Male | 28 | 0 | 0 | never | 27.32 | 5.7 | 158 | 0 |
| Female | 36 | 0 | 0 | current | 23.45 | 5.0 | 155 | 0 |
| Male | 76 | 1 | 1 | current | 20.14 | 4.8 | 155 | 0 |
| Female | 20 | 0 | 0 | never | 27.32 | 6.6 | 85 | 0 |
For this part, we focus only on HbA1c levels for males and females.
Overall, blood sugar regulation patterns appear balanced between the two genders: about 20% of females and 22% of males fall in the diabetic HbA1c range (≥ 6.5%).
| gender | diabetes | HbA1c_level | HbA1c_category |
|---|---|---|---|
| Female | 0 | 6.6 | Diabetic ≥ 6.5% |
| Female | 0 | 6.6 | Diabetic ≥ 6.5% |
| Male | 0 | 5.7 | Prediabetic 5.7% - 6.4% |
| Female | 0 | 5.0 | Normal < 5.7% |
| Male | 0 | 4.8 | Normal < 5.7% |
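The category labels in this table follow fixed HbA1c cut-offs. A minimal sketch of that mapping is given here, assuming the boundaries implied by the labels; the function name `hba1c_category` is hypothetical, and treating any value between 6.4 and 6.5 as prediabetic is an assumption, since the labels leave that gap undefined.

```python
def hba1c_category(level: float) -> str:
    """Map an HbA1c percentage to the category labels used in the table."""
    if level >= 6.5:
        return "Diabetic ≥ 6.5%"
    if level >= 5.7:
        # Assumption: values between 6.4 and 6.5 also count as prediabetic.
        return "Prediabetic 5.7% - 6.4%"
    return "Normal < 5.7%"


# HbA1c values taken from the sample rows in the table
sample_levels = [6.6, 6.6, 5.7, 5.0, 4.8]
sample_categories = [hba1c_category(level) for level in sample_levels]

for level, category in zip(sample_levels, sample_categories):
    print(level, "->", category)
```

Note that the category is based on HbA1c alone, which is why rows with `diabetes = 0` can still land in the diabetic range.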
| gender | HbA1c_category | n | percent |
|---|---|---|---|
| Female | Diabetic ≥ 6.5% | 11835 | 20.21280 |
| Female | Normal < 5.7% | 22492 | 38.41372 |
| Female | Prediabetic 5.7% - 6.4% | 24225 | 41.37348 |
| Male | Diabetic ≥ 6.5% | 8959 | 21.62443 |
| Male | Normal < 5.7% | 15358 | 37.06976 |
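The `percent` column appears to be each category's share within its gender, not within the whole data set. A small sketch of that calculation, using the three female counts from the table, is shown here; the helper name `percent_within_group` is hypothetical, and only the female rows are used because the male rows shown are incomplete.

```python
def percent_within_group(counts: dict) -> dict:
    """Convert category counts for one group into percentages of that group."""
    total = sum(counts.values())
    if total == 0:
        # Avoid dividing by zero for an empty group.
        return {category: 0.0 for category in counts}
    return {category: 100 * n / total for category, n in counts.items()}


# Female counts copied from the summary table
female_counts = {
    "Diabetic ≥ 6.5%": 11835,
    "Normal < 5.7%": 22492,
    "Prediabetic 5.7% - 6.4%": 24225,
}

female_percent = percent_within_group(female_counts)
for category, pct in female_percent.items():
    print(f"{category}: {pct:.5f}")
```

The three female percentages sum to 100, which is what confirms the within-gender reading of the column.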
| age | diabetes | heart_disease | hypertension | group |
|---|---|---|---|---|
| 57 | 1 | 1 | 1 | Diabetes, H.D, and Hyp. |
| 62 | 1 | 1 | 1 | Diabetes, H.D, and Hyp. |
| 62 | 1 | 1 | 1 | Diabetes, H.D, and Hyp. |
| 67 | 1 | 1 | 1 | Diabetes, H.D, and Hyp. |
| 72 | 1 | 1 | 1 | Diabetes, H.D, and Hyp. |
| age | heart_disease | diabetes | hypertension | group |
|---|---|---|---|---|
| 54 | 0 | 0 | 0 | Free of Diabetes, H.D, and Hyp. |
| 28 | 0 | 0 | 0 | Free of Diabetes, H.D, and Hyp. |
| 36 | 0 | 0 | 0 | Free of Diabetes, H.D, and Hyp. |
| 20 | 0 | 0 | 0 | Free of Diabetes, H.D, and Hyp. |
| 79 | 0 | 0 | 0 | Free of Diabetes, H.D, and Hyp. |
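The two `group` labels can be derived from the three binary flags. The sketch below is one way to do that, not necessarily how the tables were produced; the function name `comorbidity_group` and the "Other" fallback for mixed cases are assumptions, since the report only shows the all-three and none-of-three groups.

```python
def comorbidity_group(diabetes: int, heart_disease: int, hypertension: int) -> str:
    """Label a person by whether they have all three conditions or none of them."""
    flags = (diabetes, heart_disease, hypertension)
    if all(flag == 1 for flag in flags):
        return "Diabetes, H.D, and Hyp."
    if all(flag == 0 for flag in flags):
        return "Free of Diabetes, H.D, and Hyp."
    # Assumption: mixed cases are not labeled in the report.
    return "Other"


# (age, diabetes, heart_disease, hypertension) rows taken from the two tables
sample_rows = [(57, 1, 1, 1), (62, 1, 1, 1), (54, 0, 0, 0), (28, 0, 0, 0)]
sample_groups = [comorbidity_group(d, hd, hyp) for _, d, hd, hyp in sample_rows]

for row, group in zip(sample_rows, sample_groups):
    print(row[0], "->", group)
```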
This violin plot shows the distribution of BMI values by hypertension status. A violin plot is well suited to visualizing the distribution and density of BMI across hypertension categories:
Shape and width: The width of each “violin” represents the density of BMI values at different levels. Wider sections mean more individuals have that BMI, while narrower sections indicate fewer people at those values.
Comparison of distributions: The blue violin represents people without hypertension (hypertension = 0), while the red violin represents those with hypertension (hypertension = 1). By comparing them, you can see how BMI differs between these groups.
The horizontal line around 25 BMI: This marks the median BMI for each group. Since both violins have a horizontal line in roughly the same position, it suggests that the median BMI is around 25 for both hypertensive and non-hypertensive individuals.
Density trends: If the violins have different thicknesses in certain BMI ranges, it tells you which BMI values are more or less common in each group. People with hypertension seem to have a higher BMI overall, but both groups share a similar median.
Distribution shape: The two violins differ in shape. For example, if the hypertension violin is wider at higher BMI values, it suggests that hypertension is more common among individuals with higher BMI.
Outliers or extreme values might appear as small bulges or extended tails at the ends of the violins, showing individuals with very high or low BMI.
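The median line in each violin is a per-group median of BMI. A minimal sketch of that calculation follows, using only the six sample rows shown at the start of the report; the function name `median_bmi_by_group` is hypothetical, and six rows are far too few to reproduce the medians seen in the full plot.

```python
from collections import defaultdict
from statistics import median


def median_bmi_by_group(rows):
    """Return the median BMI for each hypertension status (0 or 1)."""
    groups = defaultdict(list)
    for hypertension, bmi in rows:
        groups[hypertension].append(bmi)
    return {status: median(values) for status, values in groups.items()}


# (hypertension, bmi) pairs from the six sample rows of the data set
sample_rows = [
    (0, 25.19),
    (0, 27.32),
    (0, 27.32),
    (0, 23.45),
    (1, 20.14),
    (0, 27.32),
]

sample_medians = median_bmi_by_group(sample_rows)
for status, value in sorted(sample_medians.items()):
    print(f"hypertension={status}: median BMI {value}")
```

Applied to the full data set, the same grouping would give the two values that the horizontal lines mark.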
The graph below is split by whether or not a person has hypertension. Comparing the BMI distributions shows that the majority of people, with or without hypertension, fall within a BMI range of 25-29. Notice that for people with hypertension, the population density above the red line is greater than for people without hypertension, indicating that a larger share of people with hypertension have a higher BMI.
| age | diabetes | blood_glucose_level |
|---|---|---|
| 80 | No Diabetes | 140 |
| 54 | No Diabetes | 80 |
| 28 | No Diabetes | 158 |
| 36 | No Diabetes | 155 |
| 76 | No Diabetes | 155 |
| age | bmi | diabetes | condition |
|---|---|---|---|
| 44 | 19.31 | 1 | Diabetes Only |
| 67 | 27.32 | 1 | Diabetes Only |
| 50 | 27.32 | 1 | Diabetes Only |
| 73 | 25.91 | 1 | Diabetes Only |
| 53 | 27.32 | 1 | Diabetes Only |
| age | bmi | heart_disease | condition |
|---|---|---|---|
| 80 | 25.19 | 1 | Heart Disease Only |
| 76 | 20.14 | 1 | Heart Disease Only |
| 72 | 27.94 | 1 | Heart Disease Only |
| 67 | 27.32 | 1 | Heart Disease Only |
| 77 | 32.02 | 1 | Heart Disease Only |
Each person in this plot has heart disease. The comparison is between people classified as underweight and overweight on the BMI scale, grouped by sex. The share of the population with heart disease is markedly higher among those classified as overweight. The plot suggests that the likelihood of heart disease increases with weight.
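The report does not state which BMI cut-offs define underweight and overweight. The sketch below assumes the standard WHO adult cut-offs (18.5, 25, and 30), which may differ from the ones used for the plot; the function name `bmi_category` is hypothetical. It is applied to the five heart disease rows tabulated earlier.

```python
from collections import Counter


def bmi_category(bmi: float) -> str:
    """Classify BMI using the standard WHO adult cut-offs (an assumption here)."""
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25:
        return "Normal"
    if bmi < 30:
        return "Overweight"
    return "Obese"


# BMI values from the five "Heart Disease Only" sample rows
heart_disease_bmi = [25.19, 20.14, 27.94, 27.32, 32.02]
category_counts = Counter(bmi_category(bmi) for bmi in heart_disease_bmi)

for category, n in category_counts.items():
    print(category, n)
```

Five rows only illustrate the classification step; they say nothing about the population-level trend.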
The analysis here depends heavily on the BMI scale. It is important to note that BMI alone is not a reliable indicator of diabetes, but the data show a general trend: people with a BMI over 30 are more likely to be diabetic.
This depicts the different categories of HbA1c levels and their relation to patients' hypertension status.
This graph shows the population density of men by diabetes status across age ranges.
This graph shows the population density of women by diabetes status across age ranges.